Fixed-Point Performance Analysis of Recurrent Neural Networks
Recurrent neural networks have shown excellent performance in many
applications; however, they require increased complexity in hardware- or
software-based implementations. The hardware complexity can be lowered
considerably by minimizing the word-length of weights and signals. This work
analyzes the fixed-point performance of recurrent neural networks using a
retrain-based quantization method. The quantization sensitivity of each layer
in RNNs is studied, and overall fixed-point optimization results that minimize
the capacity of the weights without sacrificing performance are presented. A
language model and a phoneme recognition example are used
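To make the idea of reducing the word-length of weights concrete, the following is a minimal sketch of a generic symmetric uniform quantizer. It is not taken from the paper: the function name `quantize_uniform`, the `step` parameter, and the symmetric clipping range are assumptions chosen for illustration, and the retraining stage that the abstract mentions is not shown.

```python
def quantize_uniform(weights, word_length, step):
    """Round each weight to the nearest multiple of `step`.

    Assumption (illustrative, not from the paper): a signed, symmetric
    grid with 2**(word_length - 1) - 1 levels on each side of zero.
    """
    if word_length < 2:
        raise ValueError("word_length must be at least 2 bits")
    max_level = 2 ** (word_length - 1) - 1  # e.g. 3 bits -> levels -3..3
    quantized = []
    for w in weights:
        level = round(w / step)                          # nearest grid index
        level = max(-max_level, min(max_level, level))   # clip to the range
        quantized.append(level * step)
    return quantized


# Hypothetical weights quantized to 3 bits with a step of 0.25.
example = quantize_uniform([0.3, -0.26, 1.9, -5.0], word_length=3, step=0.25)
```

In a retrain-based scheme, a quantizer of this kind would typically be applied to the weights repeatedly during further training, so that the network can adapt to the reduced precision.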
A Parallel Decomposition Scheme for Solving Long-Horizon Optimal Control Problems
We present a temporal decomposition scheme for solving long-horizon optimal
control problems. In the proposed scheme, the time domain is decomposed into a
set of subdomains with partially overlapping regions. Subproblems associated
with the subdomains are solved in parallel to obtain local primal-dual
trajectories that are assembled to obtain the global trajectories. We provide a
sufficient condition that guarantees convergence of the proposed scheme. This
condition states that the effect of perturbations on the boundary conditions
(i.e., initial state and terminal dual/adjoint variable) should decay
asymptotically as one moves away from the boundaries. This condition also
reveals that the scheme converges if the size of the overlap is sufficiently
large and that the convergence rate improves with the size of the overlap. We
prove that linear quadratic problems satisfy the asymptotic decay condition,
and we discuss numerical strategies to determine if the condition holds in more
general cases. We draw upon a non-convex optimal control problem to illustrate
the performance of the proposed scheme
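The partitioning step described above can be sketched as follows. This is only an illustration of splitting a discrete time horizon into partially overlapping subdomains; the function name `overlapping_subdomains`, the equal-sized cores, and the tuple layout are assumptions, and the parallel subproblem solves and the assembly of primal-dual trajectories are not shown.

```python
def overlapping_subdomains(horizon, num_subdomains, overlap):
    """Split time steps 0..horizon-1 into overlapping windows.

    Returns one tuple (start, end, core_start, core_end) per subdomain,
    with half-open ranges. The cores tile the horizon without gaps; each
    window extends its core by `overlap` steps on both sides, clipped to
    the horizon. All names here are hypothetical.
    """
    if num_subdomains < 1 or horizon < num_subdomains or overlap < 0:
        raise ValueError("invalid decomposition parameters")
    base = horizon // num_subdomains
    domains = []
    for k in range(num_subdomains):
        core_start = k * base
        # The last core absorbs any remainder of the horizon.
        core_end = horizon if k == num_subdomains - 1 else (k + 1) * base
        start = max(0, core_start - overlap)
        end = min(horizon, core_end + overlap)
        domains.append((start, end, core_start, core_end))
    return domains


# Hypothetical example: 12 time steps, 3 subdomains, overlap of 2 steps.
windows = overlapping_subdomains(12, 3, 2)
```

Under this layout, each subproblem would be solved on its full window while only the core portion of its solution is kept, which is one common way to assemble a global trajectory from overlapping pieces.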
B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives
Previous research addressed the potential problems that the hard-disk-oriented
design of DBMSs poses on flashSSDs. In this paper, we focus on exploiting the
potential benefits of flashSSDs. First, we examine the internal parallelism
issues of flashSSDs by benchmarking various flashSSDs. Then, we suggest
algorithm-design principles for making the best use of the internal
parallelism. We present a new I/O request concept, called psync I/O, that can
exploit the internal parallelism of flashSSDs in a single process. Based on
these ideas, we introduce B+-tree optimization methods in order to utilize
internal parallelism. By integrating the results of these methods, we present a
B+-tree variant, PIO B-tree. We confirmed that each optimization method
substantially enhances the index performance. Consequently, PIO B-tree enhanced
B+-tree's insert performance by a factor of up to 16.3, while improving
point-search performance by a factor of 1.2. The range search of PIO B-tree was
up to 5 times faster than that of the B+-tree. Moreover, PIO B-tree
outperformed other flash-aware indexes in various synthetic workloads. We also
confirmed that PIO B-tree outperforms B+-tree in index traces collected inside
the PostgreSQL DBMS with the TPC-C benchmark. Comment: VLDB201
SME supply chain collaboration innovation using an online hub
Note: Proceedings of the 8th International Conference on Innovation & Managemen